38 research outputs found
A Statistical Graphical Model of the California Reservoir System
The recent California drought has highlighted the potential vulnerability of the state's water management infrastructure to multiyear dry intervals. Due to the high complexity of the network, dynamic storage changes in California reservoirs on a state-wide scale have previously been difficult to model using either traditional statistical or physical approaches. Indeed, although there is a significant line of research exploring models for single (or a small number of) reservoirs, these approaches are not amenable to system-wide modeling of the California reservoir network due to the spatial and hydrological heterogeneities of the system. In this work, we develop a state-wide statistical graphical model to characterize the dependencies among a collection of 55 major California reservoirs across the state; this model is defined with respect to a graph in which the nodes index reservoirs and the edges specify the relationships or dependencies between reservoirs. We obtain and validate this model in a data-driven manner based on reservoir volumes over the period 2003–2016. A key feature of our framework is a quantification of the effects of external phenomena that influence the entire reservoir network. We further characterize the degree to which physical factors (e.g., state-wide Palmer Drought Severity Index (PDSI), average temperature, snow pack) and economic factors (e.g., consumer price index, number of agricultural workers) explain these external influences. As a consequence of this analysis, we obtain a system-wide health diagnosis of the reservoir network as a function of PDSI.
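Illustrative note: the abstract above describes a graph whose nodes are reservoirs and whose edges encode dependencies. The sketch below shows one common way such a graph can be read off data, assuming a Gaussian graphical model (an assumption; the abstract does not name the model family). Edges are kept where the partial correlation, computed from the inverse covariance (precision) matrix, exceeds a threshold. The covariance values, the reservoir names `A`, `B`, `C`, and the threshold are all hypothetical.

```python
import math

def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def partial_correlations(covariance):
    """Partial correlation of each pair given all other variables."""
    precision = invert(covariance)
    n = len(precision)
    return [[1.0 if i == j
             else -precision[i][j] / math.sqrt(precision[i][i] * precision[j][j])
             for j in range(n)] for i in range(n)]

def edges(covariance, names, threshold=0.05):
    """Keep an edge where the absolute partial correlation exceeds the threshold."""
    pc = partial_correlations(covariance)
    n = len(names)
    return [(names[i], names[j])
            for i in range(n) for j in range(i + 1, n)
            if abs(pc[i][j]) > threshold]

# Hypothetical covariance for three reservoirs arranged in a chain A - B - C.
toy_cov = [[1.0, 0.5, 0.25],
           [0.5, 1.0, 0.5],
           [0.25, 0.5, 1.0]]
toy_names = ["A", "B", "C"]
toy_edges = edges(toy_cov, toy_names)
print(toy_edges)
```

In this toy case A and C are correlated only through B, so the graph keeps the A-B and B-C edges and drops A-C.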
The Helioseismic and Magnetic Imager (HMI) Vector Magnetic Field Pipeline: SHARPs -- Space-weather HMI Active Region Patches
A new data product from the Helioseismic and Magnetic Imager (HMI) onboard
the Solar Dynamics Observatory (SDO) called Space-weather HMI Active Region
Patches (SHARPs) is now available. SDO/HMI is the first space-based instrument
to map the full-disk photospheric vector magnetic field with high cadence and
continuity. The SHARP data series provide maps in patches that encompass
automatically tracked magnetic concentrations for their entire lifetime; map
quantities include the photospheric vector magnetic field and its uncertainty,
along with Doppler velocity, continuum intensity, and line-of-sight magnetic
field. Furthermore, keywords in the SHARP data series provide several
parameters that concisely characterize the magnetic-field distribution and its
deviation from a potential-field configuration. These indices may be useful for
active-region event forecasting and for identifying regions of interest. The
indices are calculated per patch and are available on a twelve-minute cadence.
Quick-look data are available within approximately three hours of observation;
definitive science products are produced approximately five weeks later. SHARP
data are available at http://jsoc.stanford.edu and maps are available in either
of two different coordinate systems. This article describes the SHARP data
products and presents examples of SHARP data and parameters.
Comment: 27 pages, 7 figures. Accepted to Solar Physics
Towards real-time classification of astronomical transients
Exploration of the time domain is now a vibrant area of research in astronomy, driven by the advent of digital synoptic sky surveys. While panoramic surveys can detect variable or transient events, some follow-up observations are typically needed; for short-lived phenomena, a rapid response is essential. The ability to automatically classify and prioritize transient events for follow-up studies becomes critical as data rates increase. We have been developing such methods using the data streams from the Palomar-Quest survey, the Catalina Sky Survey and others, using the VOEventNet framework. The goal is to automatically classify transient events using the new measurements, combined with archival data (previous and multi-wavelength measurements) and contextual information (e.g., Galactic or ecliptic latitude, presence of a possible host galaxy nearby), and to iterate the classifications dynamically as the follow-up data come in (e.g., light curves or colors). We have been investigating Bayesian methodologies for classification, as well as discriminated follow-up to optimize the use of available resources, including the Naive Bayesian approach and non-parametric Gaussian process regression. We will also be deploying variants of traditional machine learning techniques such as Neural Nets and Support Vector Machines on datasets of reliably classified transients as they build up.
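Illustrative note: the Naive Bayesian approach mentioned above can be sketched as follows. This is a minimal example, not the authors' classifier. The class names, priors, feature names, and likelihood values are all hypothetical. Features that have not yet been observed are simply left out, so the posterior can be recomputed as follow-up data arrive.

```python
import math

# Hypothetical class priors and per-class probabilities that a feature is present.
PRIORS = {"supernova": 0.5, "cataclysmic_variable": 0.3, "blazar": 0.2}
LIKELIHOODS = {
    "supernova": {"host_galaxy_nearby": 0.8, "low_galactic_latitude": 0.2},
    "cataclysmic_variable": {"host_galaxy_nearby": 0.1, "low_galactic_latitude": 0.7},
    "blazar": {"host_galaxy_nearby": 0.2, "low_galactic_latitude": 0.3},
}

def posterior(observed):
    """Return class probabilities given a dict of feature name -> bool.

    Features are treated as conditionally independent given the class
    (the naive Bayes assumption); unobserved features are omitted.
    """
    log_scores = {}
    for cls, prior in PRIORS.items():
        score = math.log(prior)
        for feature, present in observed.items():
            p = LIKELIHOODS[cls][feature]
            score += math.log(p if present else 1.0 - p)
        log_scores[cls] = score
    # Normalize in a numerically stable way.
    peak = max(log_scores.values())
    weights = {cls: math.exp(s - peak) for cls, s in log_scores.items()}
    total = sum(weights.values())
    return {cls: w / total for cls, w in weights.items()}

# First look: only the host-galaxy context is known.
first = posterior({"host_galaxy_nearby": True})
# Later: a second contextual feature becomes available and the estimate is updated.
updated = posterior({"host_galaxy_nearby": True, "low_galactic_latitude": False})
print(max(first, key=first.get), max(updated, key=updated.get))
```

With these made-up numbers, the second piece of evidence raises the supernova probability above its first-look value.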
Automated Real-Time Classification and Decision Making in Massive Data Streams from Synoptic Sky Surveys
The nature of scientific and technological data collection is evolving
rapidly: data volumes and rates grow exponentially, with increasing complexity
and information content, and there has been a transition from static data sets
to data streams that must be analyzed in real time. Interesting or anomalous
phenomena must be quickly characterized and followed up with additional
measurements via optimal deployment of limited assets. Modern astronomy
presents a variety of such phenomena in the form of transient events in digital
synoptic sky surveys, including cosmic explosions (supernovae, gamma ray
bursts), relativistic phenomena (black hole formation, jets), potentially
hazardous asteroids, etc. We have been developing a set of machine learning
tools to detect, classify and plan a response to transient events for astronomy
applications, using the Catalina Real-time Transient Survey (CRTS) as a
scientific and methodological testbed. The ability to respond rapidly to the
potentially most interesting events is a key bottleneck that limits the
scientific returns from the current and anticipated synoptic sky surveys.
Similar challenges arise in other contexts, from environmental monitoring using
sensor networks to autonomous spacecraft systems. Given the exponential growth
of data rates, and the time-critical response, we need a fully automated and
robust approach. We describe the results obtained to date, and the possible
future developments.
Comment: 8 pages, IEEE conference format, to appear in the refereed
proceedings of the IEEE e-Science 2014 conf., eds. C. Medeiros et al., IEEE,
in press (2014). arXiv admin note: substantial text overlap with
arXiv:1209.1681, arXiv:1110.465
Flashes in a Star Stream: Automated Classification of Astronomical Transient Events
An automated, rapid classification of transient events detected in the modern
synoptic sky surveys is essential for their scientific utility and effective
follow-up using scarce resources. This presents some unusual challenges: the
data are sparse, heterogeneous and incomplete; evolving in time; and most of
the relevant information comes not from the data stream itself, but from a
variety of archival data and contextual information (spatial, temporal, and
multi-wavelength). We are exploring a variety of novel techniques, mostly
Bayesian, to respond to these challenges, using the ongoing CRTS sky survey as
a testbed. The current surveys are already overwhelming our ability to
effectively follow all of the potentially interesting events, and these
challenges will grow by orders of magnitude over the next decade as the more
ambitious sky surveys get under way. While we focus on an application in a
specific domain (astrophysics), these challenges are more broadly relevant for
event or anomaly detection and knowledge discovery in massive data streams.
Comment: 8 pages, to appear in refereed proceedings of the IEEE eScience 2012
conference, October 2012, IEEE Press
Modeling groundwater levels in California's Central Valley by hierarchical Gaussian process and neural network regression
Modeling groundwater levels continuously across California's Central Valley
(CV) hydrological system is challenging due to low-quality well data which is
sparsely and noisily sampled across time and space. A novel machine learning
method is proposed for modeling groundwater levels by learning from a 3D
lithological texture model of the CV aquifer. The proposed formulation performs
multivariate regression by combining Gaussian processes (GP) and deep neural
networks (DNN). The proposed hierarchical modeling approach consists of training
the DNN to learn a lithologically informed latent space in which non-parametric
regression with GP is performed. The methodology is applied to modeling
groundwater levels across the CV during 2015–2020. We demonstrate the
efficacy of GP-DNN regression for modeling non-stationary features in the well
data with fast and reliable uncertainty quantification. Our results indicate
that the 2017 and 2019 wet years in California were largely ineffective in
replenishing the groundwater loss caused during previous drought years.
Comment: Submitted to Water Resources Research
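Illustrative note: the idea of running Gaussian process regression in a learned latent space can be sketched as below. This is a toy illustration, not the authors' GP-DNN method. The `latent` function is a hypothetical fixed feature map standing in for a trained DNN, and the kernel length scale, noise level, and data values are made up.

```python
import math

def latent(x):
    """Hypothetical stand-in for a trained DNN: maps an input to 2-D latent features."""
    return (math.tanh(x), math.tanh(0.5 * x - 1.0))

def rbf(a, b, length_scale=0.3):
    """Squared-exponential kernel evaluated on latent features."""
    sq = sum((u - v) ** 2 for u, v in zip(a, b))
    return math.exp(-0.5 * sq / length_scale ** 2)

def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def gp_predict(train_x, train_y, test_x, noise=1e-4):
    """Return the GP posterior mean and variance at test_x, using latent features."""
    z_train = [latent(x) for x in train_x]
    z_test = latent(test_x)
    # Kernel matrix of the training points, with observation noise on the diagonal.
    k = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(z_train)]
         for i, a in enumerate(z_train)]
    k_star = [rbf(z_test, a) for a in z_train]
    mean = sum(w * ks for w, ks in zip(solve(k, train_y), k_star))
    reduction = sum(v * ks for v, ks in zip(solve(k, k_star), k_star))
    variance = max(rbf(z_test, z_test) - reduction, 0.0)
    return mean, variance

# Made-up observations (for example, a detrended water-level anomaly at five inputs).
train_x = [-2.0, -1.0, 0.0, 1.0, 2.0]
train_y = [0.1, 0.4, 0.2, -0.3, -0.5]
mean_near, var_near = gp_predict(train_x, train_y, 1.0)
mean_far, var_far = gp_predict(train_x, train_y, 10.0)
print(mean_near, var_near, mean_far, var_far)
```

The predictive variance is small at an input that was observed and grows for an input far from the data in latent space, which is the kind of uncertainty estimate the abstract refers to.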
Classification of Optical Transients: Experiences from PQ and CRTS Surveys
Synoptic sky surveys are opening up exciting opportunities in time domain astronomy. Gaia will make a great contribution to this field. A crucial factor for good scientific returns is real-time classification of transients, in order to optimize their follow-up. We have been developing infrastructure towards this end, starting from the completed Palomar-Quest (PQ) survey and the ongoing Catalina Real-Time Transient Survey (CRTS). CRTS has been consistently producing transients for almost three years now. We describe here the efforts related to transient classification and event dissemination. Many of the technologies and methodologies we are developing may benefit Gaia.